4CCS1DST, 2012/13 - Lecture 10 - Sorting Algorithms 



Lecture 10: 



Sorting Algorithms 



(Chapter 11, Sections 11.1, 11.2, 
11.3 from the book) 
Sorting Problem





Sort a given sequence of data items according to their keys. 
■ Examples 



1) Input: (15, ...), (26, ...), (11, ...), (23, ...), (7, ...), (31, ...), (30, ...) 
Output: (7, ...), (11, ...), (15, ...), (23, ...), (26, ...), (30, ...), (31, ...)

2) Input: ("go", ...), ("did", ...), ("me", ...), ("bet", ...), ("kit", ...) 
Output: ("bet", ...), ("did", ...), ("go", ...), ("kit", ...), ("me", ...)



Sorting is a fundamental application for computers. 

Sorting is perhaps the most intensively studied and important 
operation in computer science. 

An initial sort of the data can significantly enhance the 
performance of an algorithm. 
Merge-Sort

♦ Merge-sort on an input sequence S with n elements consists of three steps:

■ Divide: If S has zero or one element, return S immediately. Otherwise, remove all the elements from S and put them into two sequences, S1 and S2, each containing about half of the elements of S; S1 contains the first ⌈n/2⌉ elements of S and S2 contains the remaining ⌊n/2⌋ elements

■ Recur: recursively sort S1 and S2

■ Conquer: merge S1 and S2 into a sorted sequence S

♦ The base cases for the recursion are subproblems of size 0 or 1

⌈x⌉ - ceiling of x - the smallest integer m such that x <= m, e.g. ⌈5/2⌉ = 3
⌊x⌋ - floor of x - the largest integer k such that k <= x, e.g. ⌊5/2⌋ = 2
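
The three steps can be sketched in Python as follows. This is only an illustration under stated assumptions, not the book's implementation: the name `merge_sort` is made up for this sketch, the function returns a new list instead of refilling S, and the standard-library `heapq.merge` stands in for the merge step.

```python
from heapq import merge  # standard-library merge of already-sorted inputs


def merge_sort(s):
    """Return a new sorted list with the elements of s (illustrative helper)."""
    n = len(s)
    if n <= 1:                   # base case: subproblem of size 0 or 1
        return list(s)
    half = (n + 1) // 2          # ceiling of n/2
    s1 = merge_sort(s[:half])    # recur on the first ceil(n/2) elements
    s2 = merge_sort(s[half:])    # recur on the remaining floor(n/2) elements
    return list(merge(s1, s2))   # conquer: merge the two sorted halves


print(merge_sort([7, 2, 9, 4, 3, 8, 6, 1]))  # [1, 2, 3, 4, 6, 7, 8, 9]
```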

Merging Two Sorted
List-based Sequences 

Algorithm merge(A, B, S)
   Input sorted sequences A and B and an empty sequence S, implemented as linked lists
   Output sorted sequence S = A ∪ B

   while ¬A.isEmpty() ∧ ¬B.isEmpty()
      if A.first().element() <= B.first().element()
         {move the first element of A to the end of S}
         S.addLast(A.remove(A.first()))
      else
         {move the first element of B to the end of S}
         S.addLast(B.remove(B.first()))
   {move the remaining elements of A to S}
   while ¬A.isEmpty()
      S.addLast(A.remove(A.first()))
   {move the remaining elements of B to S}
   while ¬B.isEmpty()
      S.addLast(B.remove(B.first()))
   return S
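
A possible Python rendering of the list-based merge is shown here. It is a sketch that assumes `collections.deque` as a stand-in for the linked list (removal from the front and insertion at the end are both constant time); the name `merge_lists` is hypothetical.

```python
from collections import deque


def merge_lists(a, b):
    """Merge two sorted deques into a new deque, emptying both inputs."""
    s = deque()
    while a and b:
        if a[0] <= b[0]:
            s.append(a.popleft())   # move the first element of A to the end of S
        else:
            s.append(b.popleft())   # move the first element of B to the end of S
    while a:                        # move the remaining elements of A to S
        s.append(a.popleft())
    while b:                        # move the remaining elements of B to S
        s.append(b.popleft())
    return s


a, b = deque([2, 5, 8]), deque([3, 5, 9, 10])
print(list(merge_lists(a, b)))  # [2, 3, 5, 5, 8, 9, 10]
```

As in the pseudocode, both inputs are empty once the merge has finished.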





Merging Two Sorted 
Array-based Sequences 

Algorithm merge(A, B, S)
   Input sorted sequences A and B and an empty sequence S, all of which are implemented as arrays
   Output sorted sequence S = A ∪ B

   i ← 0; j ← 0
   while i < A.size() ∧ j < B.size() do
      if A.get(i) <= B.get(j)
         {copy ith element of A to the end of S}
         S.addLast(A.get(i))
         i ← i + 1
      else
         {copy jth element of B to the end of S}
         S.addLast(B.get(j))
         j ← j + 1
   {copy the remaining elements of A to S}
   while i < A.size() do S.addLast(A.get(i)); i ← i + 1
   {copy the remaining elements of B to S}
   while j < B.size() do S.addLast(B.get(j)); j ← j + 1
   return S
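
The array-based variant differs from the list-based one in that the inputs are read through two cursors and left unchanged. A minimal Python sketch, assuming ordinary lists as the arrays (the name `merge_arrays` is illustrative):

```python
def merge_arrays(a, b):
    """Merge two sorted lists into a new list without modifying them."""
    s = []
    i = j = 0                       # cursors into a and b
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            s.append(a[i])          # copy the i-th element of A to the end of S
            i += 1
        else:
            s.append(b[j])          # copy the j-th element of B to the end of S
            j += 1
    s.extend(a[i:])                 # copy the remaining elements of A
    s.extend(b[j:])                 # copy the remaining elements of B
    return s


print(merge_arrays([2, 5, 8], [3, 5, 9, 10]))  # [2, 3, 5, 5, 8, 9, 10]
```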



Merge-Sort Tree 






♦ An execution of merge-sort is depicted by a binary tree

■ each node represents a recursive call of merge-sort and stores

   ♦ unsorted sequence before the execution and its partition

   ♦ sorted sequence at the end of the execution

■ the root is the initial call

■ the leaves are calls on subsequences of size 0 or 1

Sequence: (7,2,9,4) 
Execution Example

[Figures omitted: the merge-sort tree is built step by step. Slide captions, in order: "Recursive call, base case"; "Recursive call, base case"; "Recursive call, base case, merge"; "Recursive call, merge, merge".]

Analysis of Merge-Sort

♦ The height h of the merge-sort tree is O(log n)

■ at each recursive call we divide the sequence in half

♦ The overall amount of work done at the nodes of depth i is O(n)

■ we partition and merge 2^i sequences of size n/2^i

■ we make 2^(i+1) recursive calls

♦ Thus, the total running time of merge-sort is O(n log n)
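
Summing the work over the levels of the tree makes the bound explicit. The following is a sketch that assumes, for simplicity, that n is a power of two and that partitioning and merging a sequence of size m costs at most c·m for some constant c (c is a symbol introduced here, not taken from the slides):

```latex
\[
T(n) \;\le\; \sum_{i=0}^{\log_2 n} 2^{i} \cdot c\,\frac{n}{2^{i}}
      \;=\; c\,n\,(\log_2 n + 1)
      \;=\; O(n \log n).
\]
```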
Quick-Sort

♦ Quick-sort is a randomized sorting algorithm based on the divide-and-conquer paradigm:

■ Divide: If S has at least two elements (otherwise nothing needs to be done), pick a random element x (called the pivot). Remove all the elements from S and put them into three sequences:

   ♦ L: elements less than x

   ♦ E: elements equal to x

   ♦ G: elements greater than x

■ Recur: sort L and G

■ Conquer: Put the elements back into S in order, by first inserting the elements of L, then those of E and finally those of G.

♦ Common Practice: choose the pivot to be the last element in S
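
The divide, recur and conquer steps can be sketched in Python as below. This is a minimal illustration under assumptions: the name `quick_sort` is made up here, the pivot is drawn with `random.choice`, and a new list is returned rather than refilling S.

```python
import random


def quick_sort(s):
    """Return a new sorted list using the L / E / G scheme."""
    if len(s) < 2:                        # nothing needs to be done
        return list(s)
    x = random.choice(s)                  # divide: pick a random pivot
    less = [y for y in s if y < x]        # L
    equal = [y for y in s if y == x]      # E
    greater = [y for y in s if y > x]     # G
    # recur on L and G, then conquer by concatenating L, E, G in order
    return quick_sort(less) + equal + quick_sort(greater)


print(quick_sort([7, 2, 9, 4, 3, 7, 6, 1]))  # [1, 2, 3, 4, 6, 7, 7, 9]
```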






Partition 



♦ We partition an input sequence as follows:

■ We remove, in turn, each element y from S and

■ We insert y into L, E or G, depending on the result of the comparison with the pivot x

♦ Each insertion and removal is at the beginning or at the end of a sequence, and hence takes O(1) time

♦ Thus, the partition step of quick-sort takes O(n) time



Algorithm partition(S, p)
   Input sequence S, position p of pivot
   Output subsequences L, E, G of the elements of S less than, equal to, or greater than the pivot, resp.

   L, E, G ← empty sequences
   x ← S.remove(p)
   E.addLast(x)
   while ¬S.isEmpty()
      y ← S.remove(S.first())
      if y < x
         L.addLast(y)
      else if y = x
         E.addLast(y)
      else { y > x }
         G.addLast(y)
   return L, E, G
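
The partition step on its own might look as follows in Python. The sketch assumes a `collections.deque` in place of the sequence ADT and an integer index in place of the position p; both are choices made for this illustration.

```python
from collections import deque


def partition(s, p):
    """Empty the deque s into (l, e, g) around the pivot stored at index p."""
    l, e, g = deque(), deque(), deque()
    x = s[p]
    del s[p]                 # x <- S.remove(p)
    e.append(x)
    while s:
        y = s.popleft()      # y <- S.remove(S.first())
        if y < x:
            l.append(y)
        elif y == x:
            e.append(y)
        else:                # y > x
            g.append(y)
    return l, e, g


s = deque([7, 2, 9, 4, 3, 7, 6, 1])
l, e, g = partition(s, 5)    # the pivot is the second 7
print(list(l), list(e), list(g))  # [2, 4, 3, 6, 1] [7, 7] [9]
```

Every element is removed from the front and appended at the end of one of the three sequences, so each step is constant time and the whole partition is linear.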



Quick-Sort Tree 

♦ An execution of quick-sort is depicted by a binary tree

■ Each node represents a recursive call of quick-sort and stores

   ♦ Unsorted sequence before the execution and its pivot

   ♦ Sorted sequence at the end of the execution

■ The root is the initial call

■ The leaves are calls on subsequences of size 0 or 1
Execution Example (cont.)

[Figures omitted: the quick-sort tree is built step by step. Slide captions, in order: "Partition, recursive call, base case"; "Recursive call, base case, join"; "Recursive call, pivot selection"; "Partition, recursive call, base case"; "Join, join".]

7 2 9 4 3 7 6 1 → 1 2 3 4 6 7 7 9

© 2004 Goodrich, Tamassia

Worst-case Running Time

♦ The worst case for quick-sort occurs when the pivot is the unique minimum or maximum element

♦ One of L and G has size n - 1 and the other has size 0

♦ The running time is proportional to the sum

   n + (n - 1) + ... + 2 + 1

♦ Thus, the worst-case running time of quick-sort is O(n^2)

   depth    time
   0        n
   1        n - 1
   ...      ...
   n - 1    1
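
The sum can be checked with a small experiment. The sketch below is not a sort; it only counts how many elements all partition steps touch when the pivot is always the last element and the input is already sorted (distinct keys), which is one way to trigger the worst case. The name `partition_work` is hypothetical.

```python
def partition_work(s):
    """Total number of elements handled over all calls when the pivot
    is always the last element (illustrative counter)."""
    if len(s) < 2:
        return len(s)
    x = s[-1]
    less = [y for y in s if y < x]
    greater = [y for y in s if y > x]
    return len(s) + partition_work(less) + partition_work(greater)


n = 50
print(partition_work(list(range(n))), n * (n + 1) // 2)  # 1275 1275
```

The count equals n + (n - 1) + ... + 2 + 1 = n(n + 1)/2, which grows quadratically in n.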
Expected Running Time

♦ Consider a recursive call of quick-sort on a sequence of size s

■ Good call: the sizes of L and G are each less than 3s/4

■ Bad call: one of L and G has size greater than 3s/4

♦ A call is good with probability 1/2

■ 1/2 of the possible pivots cause good calls:

   1 2 3 4       5 6 7 8 9 10 11 12       13 14 15 16
   Bad pivots    Good pivots              Bad pivots
Expected Running Time, Part 2

♦ Probabilistic Fact: The expected number of coin tosses required in order to get k heads is 2k

♦ For a node of depth i, we expect

■ i/2 ancestors are good calls

■ The size of the input sequence for the current call is at most (3/4)^(i/2) n

♦ Therefore, we have

■ For a node of depth 2 log_{4/3} n, the expected input size is one

■ The expected height of the quick-sort tree is O(log n)

♦ The amount of work done at the nodes of the same depth is O(n)

♦ Thus, the expected running time of quick-sort is O(n log n)

[Figure omitted: quick-sort tree with expected height O(log n), time per level O(n), total expected time O(n log n).]
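
The step from the size bound to the depth bound can be written out as a short derivation. It is a sketch that takes the slide's expected-size estimate (3/4)^(i/2) n at face value:

```latex
\[
\left(\tfrac{3}{4}\right)^{i/2} n \le 1
\;\iff\; \frac{i}{2}\,\log\frac{4}{3} \ge \log n
\;\iff\; i \ge 2\log_{4/3} n ,
\]
\[
\text{so the expected height is } O(\log n)
\text{ and the expected total time is } O(n)\cdot O(\log n) = O(n\log n).
\]
```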
Exercise - Merge-sort and Quick-sort



Perform merge-sort and quick-sort on 
the following sequence of numbers: 



(3, 5, 1, 9, 3) 
In-Place Quick-Sort -
optimisation of "divide" step

♦ Quick-sort can be implemented to run in-place

♦ The "divide" step can be done "in place", that is, without using any additional array, in the following way.

♦ Assume we want to divide A[leftEnd..rightEnd] with respect to pivot A[rightEnd].

♦ Maintain two indices, L (left cursor) and R (right cursor), with initial values leftEnd and rightEnd-1, respectively. The elements which have not been considered yet are in A[(L+1)..(R-1)].
□ Iterate while L <= R:

   ■ Keep increasing L by 1 while A[L] is smaller than the pivot and L <= R.

   ■ Keep decreasing R by 1 while A[R] is larger than the pivot and L <= R.

   ■ If L < R, swap A[L] and A[R], and proceed to the next iteration.

□ Swap A[L] and the pivot
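
A Python sketch of the in-place divide step and the resulting sort is given below. The function names are illustrative, and one detail is an assumption added here rather than taken from the slide: after a swap both cursors step inward, so that elements equal to the pivot cannot stall the loop.

```python
def inplace_partition(a, left_end, right_end):
    """Partition a[left_end..right_end] around the pivot a[right_end];
    return the final index of the pivot."""
    pivot = a[right_end]
    left, right = left_end, right_end - 1     # cursors L and R
    while left <= right:
        while left <= right and a[left] < pivot:
            left += 1                          # advance L
        while left <= right and a[right] > pivot:
            right -= 1                         # retreat R
        if left <= right:
            a[left], a[right] = a[right], a[left]
            left += 1                          # assumption: step past the
            right -= 1                         # pair that was just swapped
    a[left], a[right_end] = a[right_end], a[left]  # swap A[L] and the pivot
    return left


def quick_sort_in_place(a, left_end=0, right_end=None):
    """Sort the list a in place."""
    if right_end is None:
        right_end = len(a) - 1
    if left_end >= right_end:
        return
    p = inplace_partition(a, left_end, right_end)
    quick_sort_in_place(a, left_end, p - 1)
    quick_sort_in_place(a, p + 1, right_end)


a = [85, 24, 63, 45, 17, 31, 96, 50]
print(inplace_partition(a, 0, len(a) - 1), a)
# 4 [31, 24, 17, 45, 50, 85, 96, 63]
```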
Iteration 1:
swap A[L] and A[R]

   85 24 63 45 17 31 96 50     (L at 85, R at 96, pivot 50)
   85 24 63 45 17 31 96 50     (L at 85, R at 31: swap)
   31 24 63 45 17 85 96 50     (after the swap)

Iteration 2:
swap A[L] and A[R]

Iteration 3:
L<R - swap A[L] and A[R]

[Figures for iterations 2 and 3 omitted.]

Summary of Sorting Algorithms

Algorithm        Time                   Notes
selection-sort   O(n^2)                 in-place; slow (good for small inputs)
insertion-sort   O(n^2)                 in-place; slow (good for small inputs)
quick-sort       O(n log n) expected    in-place; fastest (good for large inputs)
heap-sort        O(n log n)             in-place; fast (good for large inputs)
merge-sort       O(n log n)             sequential data access; fast (good for huge inputs)
Bucket-Sort

♦ Bucket-sort does not use comparisons

♦ Let S be a sequence of n (key, element) entries with keys in the range [0, N - 1]

♦ Bucket-sort uses the keys as indices into an auxiliary array B of sequences (buckets)

   Phase 1: Empty sequence S by moving each entry (k, o) into its bucket B[k]

   Phase 2: For i = 0, ..., N - 1, move the entries of bucket B[i] to the end of sequence S

♦ Analysis:

■ Phase 1 takes O(n) time

■ Phase 2 takes O(n + N) time

   Bucket-sort takes O(n + N) time




Algorithm bucketSort(S, N)
   Input sequence S of (key, element) items with keys in the range [0, N - 1]
   Output sequence S sorted by increasing keys

   B ← array of N empty sequences
   while ¬S.isEmpty()
      f ← S.first()
      (k, o) ← S.remove(f)
      B[k].addLast((k, o))
   for i ← 0 to N - 1
      while ¬B[i].isEmpty()
         f ← B[i].first()
         (k, o) ← B[i].remove(f)
         S.addLast((k, o))
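
The two phases translate almost directly into Python. The sketch below makes two simplifying assumptions: entries are `(key, element)` tuples in an ordinary list, and a new list is returned instead of emptying and refilling S. The names `bucket_sort` and `n_keys` are illustrative.

```python
def bucket_sort(s, n_keys):
    """Sort (key, element) pairs whose integer keys lie in [0, n_keys - 1]."""
    buckets = [[] for _ in range(n_keys)]    # B <- array of N empty sequences
    for k, o in s:                           # phase 1: entry goes to bucket B[k]
        buckets[k].append((k, o))
    result = []
    for bucket in buckets:                   # phase 2: for i = 0, ..., N - 1
        result.extend(bucket)                # append B[i] to the output
    return result


entries = [(7, "d"), (1, "c"), (3, "a"), (7, "g"), (3, "b"), (7, "e")]
print(bucket_sort(entries, 10))
# [(1, 'c'), (3, 'a'), (3, 'b'), (7, 'd'), (7, 'g'), (7, 'e')]
```

Phase 1 touches each of the n entries once and phase 2 visits all N buckets plus all n entries, which matches the O(n + N) bound.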



Example 


♦ Key range [0, 9] 
Exercise - Bucket-sort

Perform bucket-sort on the following sequence of numbers:

(3, 5, 1, 9, 3, 7, 8, 8)
Stability of sorting

♦ A sorting algorithm is stable if the relative order of any two items with the same key in an input sequence is preserved after the execution of the algorithm.

♦ Stable sorting algorithms:

■ Merge-sort

■ Bucket-sort

♦ Unstable sorting algorithms:

■ Quick-sort
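
Stability often depends on a small detail such as how ties are broken. The sketch below merges two key-sorted lists of `(key, element)` pairs; the flag `take_left_on_tie` is a hypothetical parameter added only to contrast the `<=` test used by the merge algorithm with a strict `<`.

```python
def merge_by_key(a, b, take_left_on_tie=True):
    """Merge two lists of (key, element) pairs that are sorted by key."""
    s, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if take_left_on_tie:
            left_first = a[i][0] <= b[j][0]   # ties go to A: stable
        else:
            left_first = a[i][0] < b[j][0]    # ties go to B: not stable
        if left_first:
            s.append(a[i])
            i += 1
        else:
            s.append(b[j])
            j += 1
    return s + a[i:] + b[j:]


a = [(3, "first"), (5, "x")]
b = [(3, "second"), (9, "y")]
print(merge_by_key(a, b))         # [(3, 'first'), (3, 'second'), (5, 'x'), (9, 'y')]
print(merge_by_key(a, b, False))  # [(3, 'second'), (3, 'first'), (5, 'x'), (9, 'y')]
```

With `<=`, the item that came first in the input stays first among equal keys; with `<`, the two items with key 3 swap places.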
Sorting Lower Bound

Comparison-Based Sorting

♦ Many sorting algorithms are comparison based.

■ They sort by making comparisons between pairs of objects

■ Examples: selection-sort, insertion-sort, heap-sort, merge-sort, quick-sort

♦ Let us therefore derive a lower bound on the running time of any algorithm that uses comparisons to sort n elements, x1, x2, ..., xn.
Counting Comparisons

♦ Let us just count comparisons then.

♦ Each possible run of the algorithm corresponds to a root-to-leaf path in a decision tree
Decision Tree Height

♦ The height of the decision tree is a lower bound on the running time

♦ Every input permutation must lead to a separate leaf output

■ If not, some input ...4...5... would have the same output ordering as ...5...4..., which would be wrong

♦ Since there are n! = 1 · 2 · ... · n leaves, the height is at least log(n!)
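
The bound log(n!) can be simplified with a standard estimate: at least n/2 of the factors of n! are at least n/2. A short derivation (logarithms base 2):

```latex
\[
\log (n!) \;\ge\; \log\!\left(\frac{n}{2}\right)^{n/2}
          \;=\; \frac{n}{2}\,\log\frac{n}{2},
\]
\[
\text{so any comparison-based sorting algorithm takes } \Omega(n \log n) \text{ time in the worst case.}
\]
```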